Semantic-based Malay-English Translation using N-Gram Model
Authors
Abstract
Most existing machine translation systems rely on word-for-word translation. The major obstacle in developing such systems is that natural language is not free from ambiguity: one word may have more than one meaning, and vice versa. Herein, we propose a semantic-based Malay-English translation method using an n-gram model. The translation is not performed word for word but depends on the semantic meaning of the Malay phrase. In particular, a bigram model is used to approximate the probability of a word by its conditional probability given the preceding word. In this study, whenever semantic ambiguity occurs, the English word with the highest probability is chosen to translate the Malay word (or two-word Malay sequence). The proposed technique has been tested on three categories of sentences: easy, moderate and complex. The performance of the proposed Malay-English translation is assessed by human judgement, which yields a positive average validity ratio. A positive value indicates that at least half of the respondents agreed that the translation outputs at least “still make sense semantically”. The contribution of the proposed method is that it enhances word-for-word translation by resolving ambiguity in Malay-English translation.
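The bigram-based selection described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy English corpus, the toy lexicon (in which the Malay word "bulan" may mean "moon" or "month"), the add-one smoothing, and the greedy left-to-right choice. The handling of two-word Malay sequences mentioned in the abstract is not covered.

```python
from collections import Counter

# Toy English corpus (hypothetical; stands in for real training text).
CORPUS = [
    "i visit my mother every month",
    "we pay the rent every month",
    "the moon is bright tonight",
    "we saw the moon last night",
]

# Toy Malay-to-English lexicon (hypothetical); "bulan" is ambiguous.
LEXICON = {
    "setiap": ["every"],
    "bulan": ["moon", "month"],
}


def train_bigrams(sentences):
    """Count bigrams and their histories; "<s>" marks the sentence start."""
    history, bigrams, vocab = Counter(), Counter(), set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split()
        vocab.update(tokens)
        history.update(tokens[:-1])          # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))
    return history, bigrams, len(vocab)


def bigram_prob(prev, word, history, bigrams, vocab_size):
    """P(word | prev) with add-one smoothing (an assumption of this sketch)."""
    return (bigrams[(prev, word)] + 1) / (history[prev] + vocab_size)


def translate(malay_sentence, history, bigrams, vocab_size):
    """Greedily pick, for each Malay word, the English candidate with the
    highest bigram probability given the previously chosen English word."""
    prev, output = "<s>", []
    for malay_word in malay_sentence.split():
        candidates = LEXICON.get(malay_word, [malay_word])
        best = max(
            candidates,
            key=lambda e: bigram_prob(prev, e, history, bigrams, vocab_size),
        )
        output.append(best)
        prev = best
    return " ".join(output)


history, bigrams, vocab_size = train_bigrams(CORPUS)
print(translate("setiap bulan", history, bigrams, vocab_size))  # every month
```

In this toy setting "every month" occurs twice in the corpus and "every moon" never does, so "month" receives the higher conditional probability after "every" and is chosen over "moon".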
Similar resources
A first step design in integrating an English-Malay Translation Memory System into the Semantic Web
This paper discusses the first step in designing the integration of the Semantic Web into our English-Malay Translation Memory (TM) prototype system. The main research activity that we need to perform concerns how to design and construct the ontology database within the architecture of the Semantic Web environment. Research also needs to be carried out to understand precisely the layer-cake architecture of...
Bilingual-LSA Based LM Adaptation for Spoken Language Translation
We propose a novel approach to crosslingual language model (LM) adaptation based on bilingual Latent Semantic Analysis (bLSA). A bLSA model is introduced which enables latent topic distributions to be efficiently transferred across languages by enforcing a one-to-one topic correspondence during training. Using the proposed bLSA framework crosslingual LM adaptation can be performed by, first, in...
English-Persian Plagiarism Detection based on a Semantic Approach
Plagiarism which is defined as “the wrongful appropriation of other writers’ or authors’ works and ideas without citing or informing them” poses a major challenge to knowledge spread publication. Plagiarism has been placed in four categories of direct, paraphrasing (rewriting), translation, and combinatory. This paper addresses translational plagiarism which is sometimes referred to as cross-li...
Semantic Adequacy in Translation: Strategies employed in the English renderings of Sa'di's wittical remarks of The Rose Garden (Golistan)
Translating literary works is a difficult task, especially when it comes to cultural elements. It gets more difficult when words have ambiguities and multiple layers of meaning. The present study sought to examine the adequacy of witticism in the English renderings of Sa'di's clever remarks in Golistan (The Rose Garden). To this purpose, the researchers selected three English translations of Go...